Learning Objectives

By the end of this section, students will be able to:

  1. Apply first and second-order conditions for unconstrained optimization problems
  2. Use Lagrange multipliers to solve equality-constrained optimization problems
  3. Understand KKT conditions for inequality-constrained optimization
  4. Solve business optimization problems using analytic and computational methods

R skills

  1. Solve univariable unconstrained optimisation problems using optimize()
  2. Solve multivariable unconstrained optimisation problems using optim()
  3. Solve multivariable constrained optimisation problems using solnp()

Contents

  • Unconstrained optimization
    • Univariable cases
    • Multivariable cases
  • Optimization with equality constraints
    • Lagrange method
  • Optimization with inequality constraints
    • KKT conditions

Knowledge Point 1: Unconstrained Optimization

Unconstrained optimization seeks to find the maximum or minimum of a function without any restrictions on the variables. This forms the foundation for understanding more complex constrained optimization problems.

1.1 Univariable Cases

Formulation of The Problem

For a function \(f: \mathbb{R}\to \mathbb{R}\), we want to find the minimum or the maximum of \(f(x)\). Since a maximisation problem can always be reformulated as the minimisation of its negation, \(\max_{x\in\mathbb{R}}f(x) \Leftrightarrow \min_{x\in\mathbb{R}}-f(x)\), we will focus on minimisation problems.

\[\min_{x \in \mathbb{R}} f(x)\]

Definition 1.1 (First-Order Condition): For a function \(f(x)\) to have a local extremum at \(x = a\), it is necessary that: \[f'(a) = 0\]

Definition 1.2 (Second-Order Condition): For a critical point \(x = a\) where \(f'(a) = 0\):

  • If \(f''(a) > 0\), then \(x = a\) is a local minimum
  • If \(f''(a) < 0\), then \(x = a\) is a local maximum
  • If \(f''(a) = 0\), the test is inconclusive
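As a quick sketch, both conditions can be checked in R with symbolic differentiation via D(); the function \(f(x)=x^2-4x+1\) below is an illustrative example, not taken from the text:

```r
# Illustrative example: f(x) = x^2 - 4x + 1
f   <- expression(x^2 - 4*x + 1)
df  <- D(f, "x")    # symbolic first derivative: 2 * x - 4
d2f <- D(df, "x")   # symbolic second derivative: 2

x <- 2              # candidate critical point, solves 2x - 4 = 0
eval(df)            # 0: first-order condition holds
eval(d2f)           # 2 > 0: local minimum by the second-order condition
```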


Example 5.1: TechStart Pricing Strategy

TechStart Inc. wants to maximize profit from their new software product. Market research shows demand follows \(D(p) = 1000 - 10p\) and production costs are \(C(q) = 5000 + 20q\). The profit function is:

\[\pi(p) = p \cdot D(p) - C(D(p)) = p(1000 - 10p) - (5000 + 20(1000 - 10p))\] \[\pi(p) = 1000p - 10p^2 - 25000 + 200p = 1200p - 10p^2 - 25000\]

To find the optimal price, we apply the first and second-order conditions.

Analytic Solution

For TechStart’s profit function \(\pi(p) = 1200p - 10p^2 - 25000\):

Step 1: First-Order Condition \[\pi'(p) = 1200 - 20p = 0\] \[p^* = 60\]

Step 2: Second-Order Condition \[\pi''(p) = -20 < 0\]

Since \(\pi''(60) < 0\), \(p = 60\) gives a maximum profit.

Step 3: Calculate Maximum Profit \[\pi(60) = 1200(60) - 10(60)^2 - 25000 = 72000 - 36000 - 25000 = 11000\]

Numerical R Solution

profit <- function(p) {
  1200*p - 10*p^2 - 25000
}

optimize(profit, interval = c(0, 200), maximum = TRUE)
## $maximum
## [1] 60
## 
## $objective
## [1] 11000
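Equivalently, using the negation trick from the problem formulation, we can minimise \(-\pi(p)\) directly, since optimize() minimises by default:

```r
profit <- function(p) {
  1200*p - 10*p^2 - 25000
}

# Minimise the negated profit instead of maximising profit
res <- optimize(function(p) -profit(p), interval = c(0, 200))
res$minimum     # optimal price, approximately 60
-res$objective  # maximum profit, approximately 11000
```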

Exercise

Nonlinear Univariable Optimization Problem

A firm sells a premium product. Market demand decreases nonlinearly with price: \[q(p) = 1500\, e^{-0.04p},\] where \(p \ge 0\) is the price (in dollars) and \(q(p)\) is quantity demanded.

The firm’s total cost consists of a fixed overhead and a per-unit cost proportional to output: \[C(p) = 3000 \;+\; 18\, q(p).\] Revenue at price \(p\) is \(R(p) = p \cdot q(p).\)

Find the profit-maximising quantity \(q\).

Complete the task using optimize() in the following R environment or in your local RStudio.

1.2 Multivariable Cases

Formulation of the Problem

For a multivariable function \(f:\mathbb{R}^n \to \mathbb{R}\) we want to find the minimum.

\[\min_{x \in \mathbb{R}^n} f(x)\]

Definition 1.3 (First-Order Condition): For a function \(f:\mathbb{R}^n \to \mathbb{R}\) to have a local extremum at \(x = x_0\), it is necessary that: \[\nabla f(x_0) = 0\]

Definition 1.4 (Second-Order Condition): For a critical point \(x = x_0\) where \(\nabla f(x_0) = 0\): \[H = \nabla^2 f(x_0)\]

  • If \(H\) is positive definite the critical point \(x_0\) is a minimum
  • If \(H\) is negative definite the critical point \(x_0\) is a maximum
  • If \(H\) is indefinite the critical point \(x_0\) is a saddle point

For \(n=2\), i.e. when \(f\) is a function of two variables \(f(x,y)\), we have

\[\nabla f(x,y) = \begin{pmatrix} f_{x} \\ f_{y} \end{pmatrix}\]

Second-Order Conditions: Define the Hessian matrix: \[H = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}\]
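For \(n=2\), the definiteness of \(H\) can also be checked without computing eigenvalues, via the standard second-derivative (determinant) test. Define

\[D = f_{xx}f_{yy} - f_{xy}^2\]

  • If \(D > 0\) and \(f_{xx} > 0\), the critical point is a local minimum
  • If \(D > 0\) and \(f_{xx} < 0\), the critical point is a local maximum
  • If \(D < 0\), the critical point is a saddle point
  • If \(D = 0\), the test is inconclusive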

Example 5.2

Sarah owns a coffee shop that sells two main products: coffee and pastries. Her daily revenue depends on both the number of cups of coffee sold \(x\) and the number of pastries sold \(y\). Through market research, she has determined that her revenue function is

\[R(x, y) = 120x + 100y - 2x^2 - y^2 - xy\]

How can Sarah maximise her revenue?

Analytic Solution

Step 1: Find Critical Points by FOC

  • \(\nabla f =\begin{pmatrix} \frac{\partial R}{\partial x} \\ \frac{\partial R}{\partial y}\end{pmatrix}=\begin{pmatrix} 120-4x-y \\100 -2y -x \end{pmatrix}=\begin{pmatrix} 0 \\ 0\end{pmatrix}\)

  • Rearrange in matrix form \[\begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} x \\y\end{pmatrix}=\begin{pmatrix} 120 \\100\end{pmatrix} \]

  • Solve the two equation system

A <- matrix(c(4,1,1,2),2,2,byrow=TRUE)
b = c(120,100)
solve(A)%*%b
##      [,1]
## [1,]   20
## [2,]   40

Step 2: Compute eigenvalues of the Hessian

  • \(\frac{\partial^2 R}{\partial x^2} = -4\)
  • \(\frac{\partial^2 R}{\partial y^2} = -2\)
  • \(\frac{\partial^2 R}{\partial x\partial y} = -1\)

\[H = \begin{pmatrix} -4 & -1 \\ -1 & -2 \end{pmatrix}\]

H<- matrix(c(-4,-1,-1,-2),2,2,byrow=TRUE)
eigen(H)$values
## [1] -1.585786 -4.414214

Step 3: Apply Second-Order Test

Since both eigenvalues of \(H\) are negative, the Hessian matrix \(H\) is negative definite, and the critical point (20, 40) is a local maximum.

Step 4: Calculate Maximum Revenue \[R(20, 40) = 120(20) + 100(40) - 2(20)^2 - (40)^2 - (20)(40) = 3200\]

Numerical R Solution

# Define the negative profit function (because optim minimizes by default)
profit <- function(v) {
  x <- v[1]
  y <- v[2]
  - (120*x + 100*y - 2*x^2 - y^2 - x*y)  # Negative for maximization
}

# Use optim to maximize profit
optim(par = c(0, 0), fn = profit)
## $par
## [1] 20 40
## 
## $value
## [1] -3200
## 
## $counts
## function gradient 
##      163       NA 
## 
## $convergence
## [1] 0
## 
## $message
## NULL

Graphic Illustration of the critical point

library(plotly)

A<- matrix(c(-4,-1,-1,-2),2,2,byrow=TRUE)

# Create a grid over [-0.3, 0.3]
x <- seq(-0.3, 0.3,0.01 )
y <- seq(-0.3, 0.3,0.01)

z <- outer(x, y, function(x, y) {
  mapply(function(x, y) {
    X <- c(x, y)
    t(X) %*% A %*% X
  }, x, y)
})



plot_ly(x = x, y = y, z = z, type = "surface") |>
  layout(title = "Interactive Surface: z = xᵀAx",
         scene = list(
           xaxis = list(title = "dx"),
           yaxis = list(title = "dy"),
           zaxis = list(title = "dz")))

Exercise

A company produces three products: \(x\), \(y\), and \(z\).
The total revenue (in thousands of dollars) is modeled by: \[ R(x,y,z) = 50x^{0.5}y^{0.3}z^{0.2} - 0.5x^2 - 0.4y^2 - 0.6z^2 + 2xy - xz + yz \] where \(x, y, z \ge 0\) are the quantities of the three products (in hundreds of units).

Task:
Find the production levels \((x, y, z)\) that maximize revenue.

Instructions:

  1. Write the gradient vector \(\nabla R(x,y,z)\) by computing the partial derivatives with respect to \(x\), \(y\), and \(z\).
  2. Set \(\nabla R(x,y,z) = 0\) and solve for the critical point(s).
  3. Form the Hessian matrix and evaluate it at each critical point.
  4. Use the second-order test for multivariable functions to determine whether each point is a local maximum, local minimum, or saddle point.
  5. Interpret the optimal production quantities and the maximum revenue.

Solve the problem numerically with optim() to confirm your analytic solution. You can complete the task in the following R environment or in your local RStudio.

Interactive Quiz 5.1


Knowledge Point 2: Equality-Constrained Optimization (Lagrange Multipliers)

GreenInvest, an ESG-focused investment firm, faces a portfolio optimization problem involving the allocation of $100 million across three green tech companies: SolarTech, EcoTransport, and CleanWater. Each offers different returns and risks, requiring the firm to maximize expected returns while strictly adhering to the constraint that total investment equals exactly $100 million.

This constraint transforms the problem from unconstrained optimization to a constrained optimization task. The Lagrange multiplier method is employed to solve it, balancing the objective of return maximization with the investment constraint.

Formal Formulation of the Problem

For a multivariable function \(f:\mathbb{R}^n \to \mathbb{R}\) we want to find the minimum under \(m\) constraints \(h_i(x)=0\) for \(i = 1,2,\dots,m\), where each constraint is a multivariable function \(h_i:\mathbb{R}^n \to \mathbb{R}\).

\[\min_{x \in \mathbb{R}^n} f(x)\] \[\text{s.t.} \hspace{0.5cm} h_i(x) = 0, \hspace{0.5cm} i = 1, \dots, m\]

Definition 2.1 (Lagrange Function)

\[\mathcal{L}(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i h_i(x)\] The multivariable function \(\mathcal{L}(x, \lambda)\) is called the Lagrange function (or Lagrangian).

Definition 2.2 (Lagrange First-Order Condition): For the function \(f:\mathbb{R}^n \to \mathbb{R}\) to have a local extremum at \(x = x_0\) under the constraints \(h_i(x_0)=0\), it is necessary that:

\[ \nabla_x \mathcal{L}(x, \lambda) = \nabla f(x) + \sum_{i=1}^m \lambda_i \nabla h_i(x) = 0, \quad h_i(x) = 0 \quad \text{for all } i\]

Definition 2.3 (Lagrange Second-Order Condition): A critical point \(x = x^*\) is a local minimum if the following condition holds:

For all nonzero vectors \(d \in \mathbb{R}^n\) such that

\[\nabla h_i(x^*)^\top d = 0 \text{ for all } i = 1, \dots, m,\]

we have

\[d^\top \nabla^2_{xx} \mathcal{L}(x^*, \lambda) d > 0.\]

Geometric interpretation of the Lagrange Method:

At the constrained optimum, the objective function restricted to the constraint set must be at a local optimum. This implies that the constraint curve is tangent to the level curve of the objective at that point; otherwise, moving along the constraint would increase or decrease the objective.

Rise field and Isoquants

That the constraint curve is tangent to the level curve at the optimum means, in mathematical terms, that the gradient of the objective \(\nabla f\) is parallel to the gradient of the constraint \(\nabla g\). This leads to

\[\nabla f = \lambda \nabla g\] Moving all terms to the left-hand side gives \[\nabla f - \lambda \nabla g = 0,\] which is the gradient of the function \(f-\lambda g\). This leads to the Lagrange method: we can find the critical point with the first-order condition:

\[ \nabla \mathcal{L}(x) = \nabla f(x) - \lambda \nabla g(x) = 0\] In many cases the constraints are nonlinear; however, near the critical point a nonlinear constraint behaves locally like its tangent at that point, so the argument above remains valid. In linear cases \(\nabla g(x)\) is a constant vector, while in nonlinear cases \(\nabla g(x)\) varies with \(x\).

Rise field and Isoquants

Example 2.1: Budget Allocation

A firm wants to allocate its $100000 advertising budget between two channels:

  • Online advertising: \(x\) dollars
  • TV commercials: \(y\) dollars

The estimated revenue generated from these investments is: \[ R(x, y) = 200\sqrt{x} + 300\sqrt{y} \] The firm must satisfy the total budget constraint: \[ x + y = 100{,}000 \]

We formulate this business problem as a maximisation problem subject to the equality constraint: \[ \max_{x, y} \quad R(x, y) = 200\sqrt{x} + 300\sqrt{y} \] \[ \text{subject to } x + y = 100000 \]

Lagrangian Method

We form the Lagrangian: \[ \mathcal{L}(x, y, \lambda) = 200\sqrt{x} + 300\sqrt{y} - \lambda(x + y - 100{,}000) \]

First-Order Conditions (FOCs)

\[\begin{align*} \frac{\partial \mathcal{L}}{\partial x} &= \frac{100}{\sqrt{x}} - \lambda = 0 \quad (1) \\ \frac{\partial \mathcal{L}}{\partial y} &= \frac{150}{\sqrt{y}} - \lambda = 0 \quad (2) \\ \frac{\partial \mathcal{L}}{\partial \lambda} &= x + y - 100{,}000 = 0 \quad (3) \end{align*}\]

From equations (1) and (2): \[ \frac{100}{\sqrt{x}} = \frac{150}{\sqrt{y}} \Rightarrow \frac{\sqrt{y}}{\sqrt{x}} = \frac{3}{2} \Rightarrow \frac{y}{x} = \left(\frac{3}{2}\right)^2 = \frac{9}{4} \Rightarrow y = \frac{9}{4}x \]

Substitute into the budget constraint: \[ x + \frac{9}{4}x = 100{,}000 \Rightarrow \frac{13}{4}x = 100{,}000 \Rightarrow x = \frac{400{,}000}{13} \approx 30{,}769.23 \] \[ y = \frac{9}{4}x = \frac{900{,}000}{13} \approx 69{,}230.77 \]

Second-Order Condition (SOC)

We verify the SOC on the tangent space of the constraint. Note that the constraint is linear, so its second derivatives vanish: \[\nabla^2_{x,y} \mathcal{L} = \nabla^2_{x,y} R - \lambda\,\nabla^2_{x,y}(x + y - 100{,}000) = \nabla^2_{x,y} R\] We therefore only need the Hessian of \(R(x,y)\) for the SOC.

The second derivatives are: \[ \frac{\partial^2 R}{\partial x^2} = -100x^{-3/2} < 0, \quad \frac{\partial^2 R}{\partial y^2} = -150y^{-3/2} < 0 \]

Hessian matrix: \[ H = \begin{bmatrix} -100x^{-3/2} & 0 \\ 0 & -150y^{-3/2} \end{bmatrix} \]

Constraint gradient: \[ \nabla g = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

Tangent-space vectors (those satisfying \(\nabla g^\top d = 0\)) are the multiples of: \[ d = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]

Evaluate the SOC: \[ d^\top \mathcal{L}_{x,y}d = d^\top R_{x,y}d = d^\top H d = -100x^{-3/2} - 150y^{-3/2} < 0 \]

Therefore, the Hessian is negative definite on the tangent space \(\Rightarrow\) local maximum.

Determine the maximum revenue

\[R(x^*,y^*)=200\sqrt{30769.23}+300\sqrt{69230.77}=114017.5\]
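A quick arithmetic check of this value in R, using the exact fractions for \(x^*\) and \(y^*\):

```r
x_star <- 400000/13   # approximately 30769.23
y_star <- 900000/13   # approximately 69230.77
R_max  <- 200*sqrt(x_star) + 300*sqrt(y_star)
round(R_max, 1)       # 114017.5
```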

Numerical R solution

library(Rsolnp)

# Objective function to maximize — note: Rsolnp minimizes by default, so we negate it
fn <- function(par) {
  x <- par[1]
  y <- par[2]
  return(-(200 * sqrt(x) + 300 * sqrt(y)))  # negative for maximization
}

# Equality constraint function: x + y = 100000
eqfun <- function(par) {
  return(sum(par))
}

# Equality bound
eqB <- 100000

# Initial guess
start <- c(50000, 50000)

# Solve with Rsolnp
res <- solnp(pars = start,
             fun = fn,
             eqfun = eqfun,
             eqB = eqB,
             LB = c(0, 0))  # lower bounds for x and y
## 
## Iter: 1 fn: -114017.5423  Pars:  30763.47320 69236.52680
## Iter: 2 fn: -114017.5423  Pars:  30763.47320 69236.52680
## solnp--> Completed in 2 iterations
# Extract solution
opt_x <- res$pars[1]
opt_y <- res$pars[2]
opt_value <- -res$values[length(res$values)]  # reverse the negation
#cat("Optimal x:", opt_x, "\n")
#cat("Optimal y:", opt_y, "\n")
#cat("Maximum Revenue:", opt_value, "\n")

Interactive Visualization

(Interactive figure: surface of the revenue function \(R(x,y)\) with the budget-constraint plane and contour plot.)

Interactive Quiz 5.2


Knowledge Point 3: Optimisation with Equality and Inequality Constraints (KKT Conditions)

Problem Formulation

Definition 3.1 (KKT Conditions): For the problem: \[\min f(x) \quad \text{subject to} \quad g_i(x) \leq 0, \quad h_j(x) = 0\]

The Karush-Kuhn-Tucker conditions are:

  1. Stationarity: \(\nabla f(x^*) + \sum_i \mu_i \nabla g_i(x^*) + \sum_j \lambda_j \nabla h_j(x^*) = 0\)
  2. Primal feasibility: \(g_i(x^*) \leq 0\), \(h_j(x^*) = 0\)
  3. Dual feasibility: \(\mu_i \geq 0\)
  4. Complementary slackness: \(\mu_i g_i(x^*) = 0\)

Economic Interpretation: The multiplier \(\mu_i\) represents the shadow price of constraint \(i\). If \(\mu_i > 0\), the constraint is binding (active). If \(\mu_i = 0\), the constraint is not binding.

Example 5.3: Media plan under cuts and reach compliance

\[ \min_{x,y}\; f(x,y)=(x-5)^2+(y-5)^2 \]

Constraints.
\[ x+y=3, \] \[ x+2y\ge 5. \]

First-Order Conditions (KKT). Write the constraints in standard form: \(h(x,y)=x+y-3=0\) and \(g(x,y)=5-(x+2y)\le 0\).

Lagrangian: \[ \mathcal L(x,y,\lambda,\mu)=(x-5)^2+(y-5)^2+\lambda\,h(x,y)+\mu\,g(x,y),\quad \mu\ge 0. \]

Stationarity: \[ \partial_x\mathcal L=2(x-5)+\lambda-\mu=0,\qquad \partial_y\mathcal L=2(y-5)+\lambda-2\mu=0. \] Feasibility: \(x+y=3,\; x+2y\ge 5\).

Dual feas./slackness: \(\mu\ge 0,\; \mu\,g(x,y)=0\).

Because the unconstrained projection onto \(x+y=3\) is \((1.5,1.5)\) and violates \(x+2y\ge 5\), the inequality is active at optimum: \(x+2y=5\). Solving \[ x+y=3,\quad x+2y=5 \;\Rightarrow\; (x^*,y^*)=(1,2). \] Multipliers from stationarity: \[ \lambda-\mu=8,\quad \lambda-2\mu=6 \;\Rightarrow\; (\lambda^*,\mu^*)=(10,2). \]

Second-order condition (SOC)

\(\nabla^2 f=2I\succ 0\) (strictly convex objective) and the constraints are affine, so the KKT point is the unique global minimizer.
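We can also confirm the KKT point directly by parameterising the feasible set: substituting \(x = 3 - y\) from the equality constraint, the inequality \(x + 2y \ge 5\) becomes \(y \ge 2\), and the objective along the constraint can be minimised on a grid:

```r
# Objective along the equality constraint x = 3 - y
f_line <- function(y) (3 - y - 5)^2 + (y - 5)^2

# Feasibility: x + 2y >= 5 with x = 3 - y gives y >= 2
y_grid <- seq(2, 10, by = 0.01)
y_opt  <- y_grid[which.min(f_line(y_grid))]
c(x = 3 - y_opt, y = y_opt, f = f_line(y_opt))   # (1, 2, 25)
```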

Managerial interpretation

  • Constrained optimum: reached at (1, 2), the closest feasible point to the unconstrained minimum at (5, 5).
  • Shadow prices: \(\lambda^*=10\), \(\mu^*=2\). This implies:
    • Raising the right-hand side of the equality constraint from 3 to 4 would reduce the objective (the squared deviation) by approximately 10.
    • Relaxing the inequality constraint by 1 would reduce the objective by approximately 2.

Numerical R Solution

# install.packages("Rsolnp") # if needed
library(Rsolnp)

# Objective
fn <- function(w){
  x <- w[1]; y <- w[2]
  (x - 5)^2 + (y - 5)^2
}

# Equality: x + y = 3
eqf <- function(w){
  w[1] + w[2]
}

# Inequality: x + 2y >= 5  -> return x+2y with lower bound 5
ineqf <- function(w){
  w[1] + 2*w[2]
}

ini <- c(0, 0)

res <- solnp(
  pars   = ini,
  fun    = fn,
  eqfun  = eqf,   eqB = 3,
  ineqfun = ineqf, ineqLB = 5, ineqUB = 1e6,
  LB = c(-1e6, -1e6), UB = c(1e6, 1e6)
)
## 
## Iter: 1 fn: 25.0000   Pars:  1.00000 2.00000
## Iter: 2 fn: 25.0000   Pars:  1.00000 2.00000
## solnp--> Completed in 2 iterations
res$pars      # should be ~ c(1, 2)
## [1] 1 2
res$values
## [1] 50 25 25
## --- KKT multipliers from stationarity ---
x <- res$pars[1]; y <- res$pars[2]

# Stationarity system:
# 2(x-5) + lambda - mu = 0
# 2(y-5) + lambda - 2*mu = 0
A <- rbind(c(1, -1), c(1, -2))
b <- c(-2*(x - 5), -2*(y - 5))
sol <- solve(A, b)
lambda <- sol[1]; mu <- sol[2]

list(x = x, y = y, lambda = lambda, mu = mu,
     eq_ok = abs(x + y - 3) < 1e-8,
     ineq_val = x + 2*y, ineq_active = abs(x + 2*y - 5) < 1e-8)
## $x
## [1] 1
## 
## $y
## [1] 2
## 
## $lambda
## [1] 10
## 
## $mu
## [1] 2
## 
## $eq_ok
## [1] TRUE
## 
## $ineq_val
## [1] 5
## 
## $ineq_active
## [1] FALSE

Interactive Visualization

Exercise

A manufacturer produces two standard products, \(x\) and \(y\), and one premium product, \(z\).
The profit function (in thousands of dollars) is given by: \[ \pi(x,y,z) = 40x^{0.5}y^{0.3}z^{0.2} - 0.4x^2 - 0.3y^2 - 0.5z^2 \] where \(x, y, z \ge 0\) are production quantities (in hundreds of units).

The firm faces the following constraints:

  1. Resource constraint (equality)
    The total labor used by all products must equal 2400 hours: \[ 4x + 6y + 8z = 2400 \] where the coefficients represent labor hours per hundred units.

  2. Market constraint (inequality)
    The premium product’s output cannot exceed 60% of the total production: \[ z \le 0.6(x + y + z) \]

Task:

  1. Write down the Lagrangian function for the problem, introducing a Lagrange multiplier \(\lambda\) for the equality constraint and \(\mu\) for the inequality constraint.
  2. State the Karush–Kuhn–Tucker (KKT) conditions.
  3. Solve the first-order conditions along with the constraints to find the optimal \((x^*, y^*, z^*)\), the associated multipliers, and the maximum profit.
  4. Interpret the economic meaning of the Lagrange multipliers \(\lambda\) and \(\mu\).

Hint: Remember that for the inequality constraint \(g(x,y,z) \le 0\), the KKT complementary slackness condition requires \(\mu \cdot g(x,y,z) = 0\) and \(\mu \ge 0\).

Solve the problem with solnp() in the following R environment or in your local RStudio.

Interactive Quiz 5.3


Knowledge Point 4: Business Applications

Comprehensive Optimization Examples

Real business optimization problems often combine multiple concepts: unconstrained optimization for initial analysis, equality constraints for budget limitations, and inequality constraints for practical restrictions.

Example 4.1: Multi-Product Pricing Strategy

Problem Statement

A firm chooses production levels of two products, \(x\) and \(y\). Profit is given by

\[ \Pi(x,y) = 120x + 100y - 2x^2 - y^2 - xy . \]

This is an unconstrained maximisation problem.


Analytical Solution

First-Order Conditions (FOC)

\[ \frac{\partial \Pi}{\partial x} = 120 - 4x - y = 0, \quad \frac{\partial \Pi}{\partial y} = 100 - x - 2y = 0 \]

Solving:

\[ x^* = 20, \quad y^* = 40 \]

Second-Order Conditions (SOC)

The Hessian is

\[ H = \begin{bmatrix} -4 & -1 \\ -1 & -2 \end{bmatrix} \]

\(\det(H) = 7 > 0\) and \(-4 < 0\) ⇒ \(H\) is negative definite ⇒ local (and here global) maximum.


Numerical Solution in R

# Profit function
Pi <- function(v) {
  x <- v[1]; y <- v[2]
  120*x + 100*y - 2*x^2 - y^2 - x*y
}

# Negate because optim() minimizes
negPi <- function(v) -Pi(v)

# Gradient of -Pi
grad_negPi <- function(v) {
  x <- v[1]; y <- v[2]
  gx <- -(120 - 4*x - y)
  gy <- -(100 - x - 2*y)
  c(gx, gy)
}

start <- c(0, 0)
fit <- optim(
  par = start,
  fn = negPi,
  gr = grad_negPi,
  method = "BFGS",
  control = list(reltol = 1e-12)
)

fit$par       # x*, y*
## [1] 20.00000 39.99998
-Pi(fit$par)  # negative of the profit (recall optim minimised -Pi); profit itself is 3200
## [1] -3200

Contour Plot

Interactive 3D Surface Plot

Example 4.2: Portfolio Selection with Two Assets

Problem Setup (component form)

We form a portfolio of two ASX stocks: CBA.AX and BHP.AX. Let weights be \(w_1\) (CBA) and \(w_2\) (BHP), with the budget constraint \(w_1 + w_2 = 1\). We maximize mean–variance utility with risk aversion \(\gamma>0\): \[ \max_{w_1,w_2}\; \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 -\frac{\gamma}{2}\Big[\hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2\Big]\]

\[ \text{s.t. } w_1+w_2=1.\]

Here \(\hat{\mu}_i\) are sample mean returns, and \[ \hat{\Sigma}= \begin{bmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{12}\\ \hat{\sigma}_{12} & \hat{\sigma}_2^2 \end{bmatrix} \quad\text{is the sample covariance matrix.} \]


Data: Monthly Prices (last 5 years) and Empirical Moments

Problem setup (data-driven two-asset portfolio)

We form a portfolio of two ASX stocks: CBA.AX (Commonwealth Bank) and BHP.AX (BHP Group).
Let weights be \(w_1\) (CBA) and \(w_2\) (BHP), with the budget constraint \(w_1 + w_2 = 1\).

We will estimate monthly log-return means and covariances from the last 5 years of data, then solve a mean–variance utility maximisation problem.


Data and empirical estimation

Let \(P_{i,t}\) be the adjusted closing price of asset \(i \in \{1,2\}\) at month \(t\) (with \(i=1\) for CBA, \(i=2\) for BHP). Define monthly log returns \[ r_{i t} \;=\; \ln\!\left(\frac{P_{i,t}}{P_{i,t-1}}\right), \qquad t=1,\dots,T. \]

Sample mean (monthly) \[ \hat{\mu} = \begin{bmatrix} \hat{\mu}_1\\[2pt] \hat{\mu}_2 \end{bmatrix} = \begin{bmatrix} \frac{1}{T}\sum_{t=1}^T r_{1t}\\[6pt] \frac{1}{T}\sum_{t=1}^T r_{2t} \end{bmatrix}. \]

Sample covariance (monthly) \[ \hat{\Sigma} = \begin{bmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{12}\\[2pt] \hat{\sigma}_{12} & \hat{\sigma}_2^2 \end{bmatrix}, \quad \hat{\sigma}_1^2=\frac{1}{T-1}\sum_{t=1}^T (r_{1t}-\hat{\mu}_1)^2,\; \hat{\sigma}_2^2=\frac{1}{T-1}\sum_{t=1}^T (r_{2t}-\hat{\mu}_2)^2,\; \hat{\sigma}_{12}=\frac{1}{T-1}\sum_{t=1}^T (r_{1t}-\hat{\mu}_1)(r_{2t}-\hat{\mu}_2). \]

(Optional annualisation: multiply means by \(12\) and standard deviations by \(\sqrt{12}\).)
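A quick self-contained check that these formulas agree with base R's colMeans() and cov(); the returns here are simulated, since the example assumes no particular data:

```r
set.seed(42)
r <- matrix(rnorm(120, mean = 0.01, sd = 0.05), ncol = 2)  # 60 "months", 2 assets
n_obs <- nrow(r)

mu_hat  <- colSums(r) / n_obs                              # sample mean vector
s12_hat <- sum((r[, 1] - mu_hat[1]) * (r[, 2] - mu_hat[2])) / (n_obs - 1)  # sample covariance

all.equal(mu_hat, colMeans(r))     # TRUE
all.equal(s12_hat, cov(r)[1, 2])   # TRUE
```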


Optimisation problem

Given risk aversion \(\gamma>0\), choose \((w_1,w_2)\) to maximise mean–variance utility: \[ \max_{w_1,w_2}\;\; \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 \;-\;\frac{\gamma}{2}\left( \hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2 \right) \quad\text{s.t.}\quad w_1 + w_2 = 1. \]

(No-shorting \(0\le w_i\le1\) can be added later; here we keep only the equality constraint.)


Analytical solution (equality constraint only)

Lagrangian \[ \mathcal{L}(w_1,w_2,\lambda) = \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 -\frac{\gamma}{2}\!\left(\hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2\right) +\lambda(1-w_1-w_2). \]

First-order conditions (FOC) \[ \frac{\partial \mathcal{L}}{\partial w_1}:\quad \hat{\mu}_1 - \gamma(\hat{\sigma}_1^2 w_1 + \hat{\sigma}_{12} w_2) - \lambda = 0, \] \[ \frac{\partial \mathcal{L}}{\partial w_2}:\quad \hat{\mu}_2 - \gamma(\hat{\sigma}_{12} w_1 + \hat{\sigma}_2^2 w_2) - \lambda = 0, \] \[ \frac{\partial \mathcal{L}}{\partial \lambda}:\quad w_1 + w_2 = 1. \]

Second-order condition (SOC)
The Hessian with respect to \((w_1,w_2)\) is \[ H = -\gamma \begin{bmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{12}\\[2pt] \hat{\sigma}_{12} & \hat{\sigma}_2^2 \end{bmatrix}. \] Since \(\hat{\Sigma}\succ 0\) and \(\gamma>0\), we have \(H\prec 0\) (negative definite), so the FOC point is a unique global maximiser.
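The SOC can be verified numerically by checking the eigenvalues of \(H = -\gamma\hat{\Sigma}\). Here we use an illustrative covariance matrix (the same assumed values as in the constrained-optimization code later in this example), not estimated moments:

```r
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2)  # illustrative covariance matrix (assumed)
gamma <- 2
H <- -gamma * Sigma
eigen(H)$values   # both negative, so H is negative definite
```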


Solving step-by-step

  1. Subtract the two FOCs to eliminate \(\lambda\): \[ (\hat{\mu}_1 - \hat{\mu}_2) - \gamma\left[(\hat{\sigma}_1^2 - \hat{\sigma}_{12})\,w_1 - (\hat{\sigma}_2^2 - \hat{\sigma}_{12})\,w_2\right] = 0 \]

  2. Substitute \(w_2 = 1 - w_1\) and solve for \(w_1\): \[ \boxed{ w_1^* = \frac{(\hat{\mu}_1 - \hat{\mu}_2)\;+\;\gamma(\hat{\sigma}_2^2 - \hat{\sigma}_{12})} {\gamma\big(\hat{\sigma}_1^2 + \hat{\sigma}_2^2 - 2\hat{\sigma}_{12}\big)} }, \qquad \boxed{\,w_2^* = 1 - w_1^*\, .} \]

\[ \boxed{ \lambda = \frac{ \big(\hat{\sigma}_2^2 - \hat{\sigma}_{12}\big)\hat{\mu}_1 \;+\; \big(\hat{\sigma}_1^2 - \hat{\sigma}_{12}\big)\hat{\mu}_2 \;-\; \gamma\big(\hat{\sigma}_1^2 \hat{\sigma}_2^2 - \hat{\sigma}_{12}^2\big) }{ \hat{\sigma}_1^2 + \hat{\sigma}_2^2 - 2\hat{\sigma}_{12} } } \]

The denominator equals \(\mathrm{Var}(r_{1}-r_{2})=\hat{\sigma}_1^2+\hat{\sigma}_2^2-2\hat{\sigma}_{12}\), which is positive unless the assets are perfectly linearly dependent.
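As a sanity check, plugging the illustrative moments used in the constrained-optimization code below (\(\hat\mu = (0.08, 0.12)\), \(\hat\sigma_1^2 = 0.04\), \(\hat\sigma_2^2 = 0.09\), \(\hat\sigma_{12} = 0.01\), \(\gamma = 2\); these are assumed values, not estimates) into the boxed formula for \(w_1^*\):

```r
mu1 <- 0.08; mu2 <- 0.12             # illustrative expected returns (assumed)
s1sq <- 0.04; s2sq <- 0.09; s12 <- 0.01
gamma <- 2

w1 <- ((mu1 - mu2) + gamma * (s2sq - s12)) / (gamma * (s1sq + s2sq - 2*s12))
c(w1 = w1, w2 = 1 - w1)              # approximately (0.5455, 0.4545)
```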

R code for empirical data extraction

# install.packages(c("tidyquant","dplyr","tidyr","lubridate"), dependencies = TRUE)
library(tidyquant)
library(dplyr)
library(tidyr)
library(lubridate)

symbols <- c("CBA.AX", "BHP.AX")
start   <- floor_date(Sys.Date() - years(5), unit = "month")
end     <- Sys.Date()

# 1) Download daily prices, then compute monthly log returns from adjusted prices
prices <- tq_get(symbols, from = start, to = end, get = "stock.prices")

monthly_ret <- prices %>%
  group_by(symbol) %>%
  tq_transmute(
    select     = adjusted,
    mutate_fun = periodReturn,
    period     = "monthly",
    type       = "log",
    col_rename = "ret"
  ) %>%
  ungroup()

# 2) Wide matrix of returns: rows = months, columns = symbols
R <- monthly_ret %>%
  select(symbol, date, ret) %>%
  pivot_wider(names_from = symbol, values_from = ret) %>%
  arrange(date) %>%
  drop_na()

# 3) Empirical moments (monthly log returns)
r_mat <- as.matrix(R[, symbols])
mu    <- colMeans(r_mat)   # vector c(mu_CBA, mu_BHP)
Sigma <- cov(r_mat)        # 2x2 covariance matrix

# Print results
mu
##      CBA.AX      BHP.AX 
## 0.018113866 0.009845225
Sigma
##              CBA.AX       BHP.AX
## CBA.AX 0.0036015332 0.0001531928
## BHP.AX 0.0001531928 0.0047672853

R code for constrained optimization

library(Rsolnp)
# Two-asset portfolio optimization
mu <- c(0.08, 0.12)  # Expected returns
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2)  # Covariance matrix


Omega <- Sigma
R <- mu
n <- 2

fn1=function(w)
{
  -t(R)%*%w + t(w)%*%Omega%*%w     ### for gamma = 2
}

### the constraint that the weights sum to 1
eqn1=function(w){
  return(sum(w))
}

### the initial value for numerical procedure 
ini = c(0.5,0.5)
solution=solnp(ini, fun = fn1, eqfun = eqn1, eqB = c(1),LB=rep(0,n),UB=rep(1,n))
## 
## Iter: 1 fn: -0.06273  Pars:  0.54545 0.45455
## Iter: 2 fn: -0.06273  Pars:  0.54545 0.45455
## solnp--> Completed in 2 iterations
### the optimal values
solution$par
## [1] 0.5454546 0.4545454
fn1(solution$par)
##             [,1]
## [1,] -0.06272727
sum(solution$par)
## [1] 1

Interactive Quiz 5.4


Knowledge Point 5: Solution Methods in R

Computational Tools for Optimization

R provides several powerful tools for solving optimization problems. Understanding when and how to use each tool is crucial for practical problem-solving.

R Optimization Functions Overview

1. optim() - General-purpose optimization

  • Best for: Unconstrained problems, simple constraints
  • Methods: BFGS, L-BFGS-B, Nelder-Mead, CG
  • Advantages: Flexible, handles multiple variables
  • Limitations: No direct constraint handling

2. constrOptim() - Constrained optimization

  • Best for: Linear inequality constraints
  • Method: Barrier method with penalty functions
  • Advantages: Handles constraints directly
  • Limitations: Only inequality constraints, can be slow

3. lpSolve::lp() - Linear programming

  • Best for: Linear objective and constraints
  • Advantages: Very fast, guaranteed global optimum
  • Limitations: Only linear problems

4. quadprog::solve.QP() - Quadratic programming

  • Best for: Quadratic objective, linear constraints
  • Advantages: Efficient for portfolio optimization
  • Limitations: Requires specific problem structure
| Function / Package | Best For | Constraints Supported | Advantages | Limitations |
|---|---|---|---|---|
| optim() | Unconstrained problems or simple box constraints | None; box ("L-BFGS-B") | Flexible, built-in, multiple methods (BFGS, CG, Nelder-Mead, etc.) | No general equality/inequality constraints |
| constrOptim() | Problems with linear inequality constraints | Linear inequality | Direct constraint handling, built into R | No equality constraints, slower for large problems |
| lpSolve::lp() | Linear objective with linear constraints | Linear equality & inequality | Very fast, guaranteed global optimum | Only linear problems |
| quadprog::solve.QP() | Quadratic objective with linear constraints | Linear equality & inequality | Efficient for mean–variance portfolio optimisation | Requires positive-definite quadratic term, linear constraints only |
| Rsolnp::solnp() | Nonlinear objective with equality/inequality constraints | Nonlinear equality & inequality + bounds | Handles general nonlinear constraints; supports bounds | Needs good starting values; may converge to local optima |
| nloptr::nloptr() | Large/complex nonlinear optimisation | Nonlinear equality & inequality + bounds | Many algorithms (global & local), gradient-free or gradient-based | Algorithm choice/tuning is critical |
| DEoptim::DEoptim() | Global optimisation, non-smooth or multi-modal problems | Bounds only | Robust to local minima; no gradient required | Slower for smooth problems; needs tuning of population size & iterations |
| GenSA::GenSA() | Global optimisation over complex search spaces | Bounds only | Works for discrete or rugged landscapes; automatic annealing schedule | Can be slow; stochastic results vary |
| optimx::optimx() | Comparing multiple local optimisation methods | Same as the optim() methods used | Simple interface to try several methods and compare results | Constraint support limited to methods used |

Best Practices

  1. Start simple: Use analytical solutions when possible
  2. Check feasibility: Ensure initial points satisfy constraints
  3. Verify results: Compare multiple methods
  4. Scale variables: Normalize for numerical stability
  5. Set appropriate tolerances: Balance accuracy and computation time
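Best practice 3 (verify results by comparing multiple methods) can be sketched by running two optim() methods on the revenue function from Example 5.2 and comparing the solutions:

```r
# Negated objective from Example 5.2 (optim minimises by default)
negProfit <- function(v) -(120*v[1] + 100*v[2] - 2*v[1]^2 - v[2]^2 - v[1]*v[2])

nm   <- optim(c(0, 0), negProfit, method = "Nelder-Mead")
bfgs <- optim(c(0, 0), negProfit, method = "BFGS")

rbind(NelderMead = nm$par, BFGS = bfgs$par)  # both should be close to (20, 40)
```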

Interactive Quiz 5.5


Summary and Key Takeaways

Core Concepts Mastered

  1. Unconstrained Optimization: FOC and SOC provide necessary and sufficient conditions for finding extrema
  2. Lagrange Multipliers: Enable optimization with equality constraints, with multipliers representing shadow prices
  3. KKT Conditions: Extend optimization to inequality constraints, crucial for real-world business problems
  4. Business Applications: Optimization appears in pricing, production, portfolio management, and resource allocation
  5. Computational Methods: R provides multiple tools, each suited for different problem types

Business Relevance

Optimization is fundamental to business decision-making:

  • Strategic Planning: Resource allocation and capacity planning
  • Financial Management: Portfolio optimization and risk management
  • Operations: Production scheduling and supply chain optimization
  • Marketing: Pricing strategies and budget allocation

Mathematical Foundation

The progression from unconstrained to constrained optimization mirrors the evolution from theoretical models to practical business applications. Understanding both the mathematical theory and computational implementation enables effective problem-solving in complex business environments.


This concludes Section 5: Optimization. Students should now be equipped to tackle real-world optimization problems using both analytical and computational approaches.